Introduction: This analysis is based on the outputs of pairwise comparisons of differential gene expression generated by this template. It uses the results of 3 pairwise comparisons, each between a sample group and its corresponding control group, and examines how the 3 sample groups differ from each other in their sample-control differences (delta-delta). An example of such an analysis is the different responses of 3 cell types to treatment with the same drug. The analysis focuses on the overlap of differential expression at both the gene and gene set levels.
Transcriptome in immune cells of control-patient samples
RNA-seq data were generated from 3 types of immune cells from 3 controls and 3 patients. Raw data were processed to obtain gene-level read counts. Pairwise comparisons between controls and patients were performed within each immune cell type.
This report compares the results of the following pairwise comparisons.
Each comparison reported the log ratio of the sample and control group means for each gene. The global agreement of the log ratios across all genes indicates how similar or different the results of the 3 comparisons are. The full table of gene-level statistics, side by side, is here.
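As a minimal sketch of the idea above, the log ratio of group means and the global agreement between comparisons could be computed as follows. The data are simulated, and the object names (`control_mean`, `patient_mean`, `log_ratio`) and the choice of Pearson correlation as the agreement measure are assumptions made for illustration, not the method used by this report.

```r
# Toy illustration with simulated data; all names and numbers are hypothetical.
set.seed(1)
n_genes <- 200
cells   <- c("B_Cell", "T_Cell", "Monocyte")
genes   <- paste0("gene", seq_len(n_genes))

# Simulated group means on the original (non-log) scale, one column per cell type
control_mean <- matrix(rexp(n_genes * 3, rate = 1 / 100) + 1, ncol = 3,
                       dimnames = list(genes, cells))

# Patients differ from controls by an effect shared across cell types
# plus cell-type-specific noise (both on the log2 scale)
shared_effect <- rnorm(n_genes)
cell_noise    <- matrix(rnorm(n_genes * 3, sd = 0.5), ncol = 3)
patient_mean  <- control_mean * 2^(shared_effect + cell_noise)

# Log ratio of sample vs. control group means, per gene and comparison
log_ratio <- log2(patient_mean / control_mean)

# Global agreement: pairwise correlation of log ratios across all genes
agreement <- cor(log_ratio)
print(round(agreement, 2))
```

A correlation near 1 would indicate that the cell types respond similarly across genes; values near 0 would indicate largely unrelated responses.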
Each comparison identified DEGs between its 2 compared groups. See the reports of the individual comparisons for how the DEGs were selected. DEGs identified by all 3 comparisons are worth a closer look.
| | Total_gene | P < 0.01 | DEG, > control | DEG, < control |
|---|---|---|---|---|
| B_Cell | 23272 | 3235 | 984 | 1853 |
| T_Cell | 23272 | 2130 | 693 | 671 |
| Monocyte | 23272 | 1693 | 501 | 465 |
Figure 2A. Overlap of DEGs with higher expression than in control groups. Click links to view overlapping genes:
Figure 2B. Overlap of DEGs with lower expression than in control groups. Click links to view overlapping genes:
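The overlaps shown in these Venn diagrams amount to set intersections of the DEG lists. A minimal sketch, using made-up gene identifiers and a hypothetical `deg_up` list (each list is assumed to contain unique gene IDs):

```r
# Hypothetical DEG lists (genes higher than control) from the 3 comparisons
deg_up <- list(
  B_Cell   = c("g1", "g2", "g3", "g4"),
  T_Cell   = c("g2", "g3", "g5"),
  Monocyte = c("g3", "g4", "g5", "g6")
)

# Genes identified as DEGs by all 3 comparisons
shared_all <- Reduce(intersect, deg_up)

# Number of comparisons in which each gene is a DEG
n_hits <- table(unlist(deg_up))

# Genes identified by at least 2 of the 3 comparisons
shared_two_plus <- names(n_hits)[n_hits >= 2]

print(shared_all)
print(shared_two_plus)
```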
A 2-way ANOVA was performed to identify genes that respond to SLE differently in different cell types. The analysis reported 3 p values, corresponding to the effects of SLE, Cell, and their interaction. It identified 1513 significant genes with interaction p values less than 0.01. The full ANOVA results are summarized in a table here.
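For a single gene, a 2-way ANOVA with an interaction term can be sketched with base R's `aov()`. The expression values below are simulated, and the design (3 replicates per group, treated as independent samples) is an assumption for illustration; the report's own model may differ, for example in how it handles samples from the same individual.

```r
# Simulated expression of one gene; all values are hypothetical.
set.seed(42)
design <- expand.grid(
  Replicate = 1:3,
  SLE       = c("Control", "Patient"),
  Cell      = c("B_Cell", "T_Cell", "Monocyte")
)

# This toy gene responds to SLE in B cells only, i.e. an SLE-by-Cell interaction
effect      <- ifelse(design$SLE == "Patient" & design$Cell == "B_Cell", 3, 0)
design$expr <- 8 + effect + rnorm(nrow(design), sd = 0.3)

# Two-way ANOVA: main effects of SLE and Cell, plus their interaction
fit <- aov(expr ~ SLE * Cell, data = design)
tab <- summary(fit)[[1]]

# The 3 p values: SLE, Cell, and SLE:Cell
p_values <- setNames(tab[["Pr(>F)"]][1:3], trimws(rownames(tab))[1:3])
print(signif(p_values, 3))
```

A small `SLE:Cell` p value indicates that the patient-control difference depends on the cell type, which is the delta-delta effect this report is concerned with.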
Genes are often grouped into pre-defined gene sets according to their function, interaction, location, etc. Analysis then can be performed on genes in the same gene set as a unit instead of individual genes.
The average differential expression of genes in the same gene set was calculated. The gene set-level means of log ratios are summarized in this table here.
Figure 4. Each dot represents a gene set, positioned by the average log ratio of all genes in that set. The same 3D plot is shown from 2 different angles.
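The gene set-level average described above can be sketched as a per-set column mean of a gene-by-comparison matrix of log ratios. The matrix and the `gene_sets` list below are made up for illustration; genes in a set that were not measured are simply ignored in this sketch.

```r
# Hypothetical log ratios: rows are genes, columns are the 3 comparisons
log_ratio <- matrix(
  c( 1.0, 2.0, -1.0, 0.0,    # B_Cell
     0.5, 1.5, -2.0, 0.0,    # T_Cell
    -1.0, 0.0,  1.0, 2.0),   # Monocyte
  nrow = 4,
  dimnames = list(c("g1", "g2", "g3", "g4"),
                  c("B_Cell", "T_Cell", "Monocyte"))
)

# Hypothetical gene sets; "g9" is not in the data and is dropped
gene_sets <- list(set_A = c("g1", "g2"),
                  set_B = c("g3", "g4", "g9"))

# Mean log ratio of the measured genes of each set, per comparison
set_mean <- t(sapply(gene_sets, function(genes) {
  genes <- intersect(genes, rownames(log_ratio))
  colMeans(log_ratio[genes, , drop = FALSE])
}))
print(set_mean)
```

Each row of `set_mean` corresponds to one dot in a 3D plot such as Figure 4, with one coordinate per comparison.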
Each 2-group comparison performed gene set over-representation analysis (ORA), which identifies gene sets over-represented with differentially expressed genes. The ORA results of all 3 comparisons are summarized and compared here. The ORA of each gene set reports an odds ratio and a p value. These statistics from all 3 comparisons were combined and listed side by side, along with the differences of their odds ratios and the ratios of their p values (p set to 0.5 when not available), in this table here.
| | B_Cell::Higher_in_Control | B_Cell::Higher_in_SLE | T_Cell::Higher_in_Control | T_Cell::Higher_in_SLE | Monocyte::Higher_in_Control | Monocyte::Higher_in_SLE |
|---|---|---|---|---|---|---|
| BioSystems | 438 | 3212 | 921 | 479 | 684 | 1095 |
| KEGG | 40 | 319 | 49 | 119 | 48 | 164 |
| MSigDb | 857 | 4125 | 1565 | 686 | 1024 | 2024 |
| OMIM | 0 | 1 | 0 | 0 | 0 | 0 |
| PubTator | 123 | 7634 | 632 | 892 | 1726 | 1953 |
Figure 5A. The overlapping of over-represented gene sets by up-regulated genes in all 3 comparisons. Click links to view overlapping gene sets:
Figure 5B. The overlapping of over-represented gene sets by down-regulated genes in all 3 comparisons. Click links to view overlapping gene sets:
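One common way to obtain an odds ratio and p value for over-representation is Fisher's exact test on a 2x2 table of gene set membership versus DEG status. The sketch below uses that test on made-up gene lists; it is an assumption for illustration and not necessarily the exact test used by the individual comparison reports.

```r
# Hypothetical gene universe, DEG list, and gene set
universe <- paste0("g", 1:1000)
deg      <- universe[1:100]                          # 100 DEGs
gene_set <- c(universe[1:20], universe[501:520])     # 40 genes, 20 of them DEGs

# 2x2 contingency table: gene set membership vs. DEG status
in_set <- universe %in% gene_set
is_deg <- universe %in% deg
tab    <- table(in_set = in_set, is_deg = is_deg)
print(tab)

# One-sided test for over-representation of DEGs in the gene set
res <- fisher.test(tab, alternative = "greater")
print(res$estimate)   # conditional estimate of the odds ratio
print(res$p.value)
```

Here half of the gene set is differentially expressed compared with 10% of the universe, so the odds ratio is well above 1.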
Each 2-group comparison performed gene set enrichment analysis (GSEA) on genes ranked by their differential expression. The GSEA results of all 3 comparisons are summarized and compared here. The GSEA of each gene set reports an enrichment score and a p value. These statistics from all 3 comparisons were combined and listed side by side in this table here.
| | B_Cell::Higher_in_Control | B_Cell::Higher_in_SLE | T_Cell::Higher_in_Control | T_Cell::Higher_in_SLE | Monocyte::Higher_in_Control | Monocyte::Higher_in_SLE |
|---|---|---|---|---|---|---|
| C0_Hallmark | 2 | 37 | 12 | 3 | 4 | 15 |
| C1_Positional | 13 | 26 | 28 | 10 | 19 | 19 |
| C2_BioCarta_Pathways | 1 | 68 | 14 | 2 | 0 | 20 |
| C2_Chemical_and_genetic_perturbations | 36 | 1356 | 411 | 122 | 107 | 353 |
| C3_MicroRNA_targets | 0 | 51 | 3 | 5 | 4 | 4 |
| C3_TF_targets | 4 | 284 | 12 | 89 | 10 | 85 |
| C4_Cancer_gene_neighborhoods | 42 | 86 | 170 | 17 | 34 | 57 |
| C4_Cancer_modules | 10 | 176 | 78 | 18 | 28 | 65 |
| C6_Oncogenic_signatures | 2 | 116 | 9 | 17 | 20 | 20 |
| C7_Immunologic_signatures | 58 | 922 | 432 | 52 | 44 | 381 |
| GO_BP | 145 | 2065 | 437 | 239 | 321 | 498 |
| GO_CC | 67 | 159 | 120 | 34 | 37 | 50 |
| GO_MF | 44 | 359 | 87 | 71 | 73 | 121 |
| KEGG_compound | 4 | 126 | 41 | 47 | 44 | 84 |
| KEGG_enzyme | 1 | 1 | 2 | 3 | 2 | 1 |
| KEGG_module | 11 | 13 | 24 | 3 | 5 | 10 |
| KEGG_pathway | 9 | 161 | 27 | 25 | 7 | 57 |
| KEGG_reaction | 2 | 35 | 24 | 23 | 23 | 27 |
| OMIM_gene | 1 | 2 | 2 | 2 | 0 | 1 |
| REACTOME | 92 | 283 | 230 | 63 | 46 | 134 |
| WikiPathways | 2 | 91 | 13 | 5 | 9 | 21 |
Figure 6. This plot shows the global correlation (correlation coefficients = 0.361, 0.2, 0.189) of nominal enrichment scores between the 3 pairwise comparisons: B_Cell, T_Cell, and Monocyte. The same 3D plot is shown from 2 different angles. Gene sets with p values less than 0.01 in any 1, any 2, or all 3 comparisons are highlighted in yellow, orange, or red, respectively. The correlation coefficients between the enrichment scores of each pair of comparisons are:
Figure 7A. The overlapping of enriched gene sets by up-regulated genes in all 3 comparisons. Click links to view overlapping gene sets:
Figure 7B. The overlapping of enriched gene sets by down-regulated genes in all 3 comparisons. Click links to view overlapping gene sets:
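To illustrate what an enrichment score measures, the sketch below computes a simplified, unweighted running-sum statistic (Kolmogorov-Smirnov style) over a ranked gene list. The function `enrichment_score` is hypothetical and written for this example; it omits the weighting, normalization, and permutation-based p values of a full GSEA implementation, and it is not the code used to produce the results above.

```r
# Simplified, unweighted enrichment score (illustrative sketch only).
# ranked_genes: gene IDs ordered from most up- to most down-regulated
# gene_set:     gene IDs belonging to the set of interest
enrichment_score <- function(ranked_genes, gene_set) {
  hit    <- ranked_genes %in% gene_set
  n_hit  <- sum(hit)
  n_miss <- length(hit) - n_hit
  stopifnot(n_hit > 0, n_miss > 0)

  # Step up at each gene in the set, step down otherwise; the walk ends at 0
  step    <- ifelse(hit, 1 / n_hit, -1 / n_miss)
  running <- cumsum(step)

  # The score is the largest deviation of the running sum from 0
  running[which.max(abs(running))]
}

ranked <- paste0("g", 1:20)

# A set concentrated at the top of the ranking gets a positive score
es_top <- enrichment_score(ranked, c("g1", "g2", "g3"))

# A set concentrated at the bottom gets a negative score
es_bottom <- enrichment_score(ranked, c("g18", "g19", "g20"))

print(c(top = es_top, bottom = es_bottom))
```

The sign of the score indicates whether the set is enriched among genes higher in patients or higher in controls, which is how the gene sets are split in Figures 7A and 7B.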
The top 1200 genes with significant ANOVA p values (at or below the threshold set by the report parameter `prms$geneset$cluster$panova`) were used as seeds to perform a gene-gene clustering analysis, and 12 clusters were identified. ORA was performed on the clusters to identify their functional associations (see table below):
| ID | Size | B_Cell::Control | B_Cell::SLE | T_Cell::Control | T_Cell::SLE | Monocyte::Control | Monocyte::SLE | Gene_set |
|---|---|---|---|---|---|---|---|---|
| Cluster_1 | 108 | 0 | 1.5435 | 0 | -1.3742 | 0 | 1.1953 | 609 |
| Cluster_2 | 98 | 0 | 1.3704 | 0 | -1.4543 | 0 | -1.4314 | 2164 |
| Cluster_3 | 480 | 0 | 1.6840 | 0 | 1.5731 | 0 | 1.6065 | 1439 |
| Cluster_4 | 123 | 0 | 1.6020 | 0 | 1.4107 | 0 | -1.2534 | 1069 |
| Cluster_5 | 303 | 0 | -1.6430 | 0 | -1.6402 | 0 | -1.5918 | 2441 |
| Cluster_6 | 18 | 0 | 0.3060 | 0 | 1.5256 | 0 | -1.3006 | 639 |
| Cluster_7 | 122 | 0 | 1.5202 | 0 | 1.6649 | 0 | 0.6400 | 1108 |
| Cluster_8 | 65 | 0 | -1.3774 | 0 | 1.3512 | 0 | -1.4642 | 1862 |
| Cluster_9 | 22 | 0 | -1.4537 | 0 | 1.5435 | 0 | 0.1158 | 938 |
| Cluster_10 | 63 | 0 | -1.5032 | 0 | -1.3938 | 0 | 1.4290 | 1424 |
| Cluster_11 | 58 | 0 | -1.3088 | 0 | 1.0172 | 0 | 1.5528 | 2119 |
| Cluster_12 | 12 | 0 | 0.0941 | 0 | -1.4782 | 0 | 1.4278 | 308 |
Check out the RoCA home page for more information.
To reproduce this report:
Find the data analysis template you want to use and an example of its pairing YAML file here, and download the YAML example to your working directory.
To generate a new report using your own input data and parameters, edit the following items in the YAML file:
Run the code below within R Console or RStudio, preferably in a new R session:
if (!require(devtools)) { install.packages('devtools'); require(devtools); }
if (!require(RCurl)) { install.packages('RCurl'); require(RCurl); }
if (!require(RoCA)) { install_github('zhezhangsh/RoCAR'); require(RoCA); }
CreateReport(filename.yaml); # filename.yaml is the YAML file you just downloaded and edited for your analysis
If no errors are reported, go to the output folder and open the index.html file to view the report.
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.10.5 (Yosemite)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] DEGandMore_0.0.0.9000 snow_0.4-1 rchive_0.0.0.9000
## [4] VennDiagram_1.6.17 futile.logger_1.4.1 scatterplot3d_0.3-37
## [7] gplots_3.0.1 MASS_7.3-45 htmlwidgets_0.6
## [10] DT_0.1 awsomics_0.0.0.9000 yaml_2.1.13
## [13] rmarkdown_0.9.6 knitr_1.13 RoCA_0.0.0.9000
## [16] RCurl_1.95-4.8 bitops_1.0-6 devtools_1.12.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.5 magrittr_1.5 highr_0.6
## [4] stringr_1.0.0 caTools_1.17.1 tools_3.2.2
## [7] parallel_3.2.2 KernSmooth_2.23-15 lambda.r_1.1.7
## [10] withr_1.0.2 htmltools_0.3.5 gtools_3.5.0
## [13] digest_0.6.9 formatR_1.4 futile.options_1.0.0
## [16] memoise_1.0.0 evaluate_0.9 gdata_2.17.0
## [19] stringi_1.1.1 jsonlite_0.9.22